
@meiravgri
Collaborator

@meiravgri meiravgri commented Jan 4, 2026

Motivation

In asymmetric scalar quantization, stored vectors are quantized to uint8_t while query vectors remain floats. Writing each stored value as x_i = min + delta * q_i (where q_i is the quantized uint8_t value) and the query as y, the distance formulas require sums over the query vector:

  • IP/Cosine: IP(x, y) = min * Σy_i + delta * Σ(q_i * y_i) — requires y_sum = Σy_i
  • L2: ||x - y||² = x_sum_squares - 2 * IP(x, y) + y_sum_squares — requires y_sum_squares = Σy_i²
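
The IP identity can be checked numerically. Below is a minimal sketch, assuming the dequantization rule x_i = min + delta * q_i implied by the formula; the function names `asymmetric_ip` and `reference_ip` are hypothetical and do not come from this PR.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: inner product between a quantized stored vector and a
// float query, using the precomputed query sum y_sum = sum(y_i).
// Assumes each stored value dequantizes as x_i = min + delta * q_i.
inline float asymmetric_ip(const std::vector<std::uint8_t> &q, float min, float delta,
                           const std::vector<float> &y, float y_sum) {
    float qy = 0.0f;
    for (std::size_t i = 0; i < q.size(); ++i) {
        qy += static_cast<float>(q[i]) * y[i];
    }
    return min * y_sum + delta * qy;
}

// Reference: dequantize every element first, then take the plain inner product.
inline float reference_ip(const std::vector<std::uint8_t> &q, float min, float delta,
                          const std::vector<float> &y) {
    float ip = 0.0f;
    for (std::size_t i = 0; i < q.size(); ++i) {
        ip += (min + delta * static_cast<float>(q[i])) * y[i];
    }
    return ip;
}
```

Both functions should agree up to floating-point rounding. The first one touches only the raw uint8_t values in its loop and applies min and delta once at the end.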

Without precomputation, these values would have to be recomputed in every distance calculation. Computing them during query preprocessing turns that into a one-time cost per query.


New Functionality

QuantPreprocessor::preprocessQuery now:

  1. Allocates a query blob of size (dim + 1) * sizeof(DataType)
  2. Copies the original query values
  3. Appends the precomputed value based on metric:
    • IP/Cosine: y_sum = Σy_i
    • L2: y_sum_squares = Σy_i²

Query blob layout:

| query_values[dim] | y_sum (IP/Cosine) OR y_sum_squares (L2) |
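
The three steps and the layout above can be sketched as follows. This is an illustration only: `build_query_blob` and the `Metric` enum are hypothetical stand-ins, and the sketch uses `std::vector` rather than whatever allocator `QuantPreprocessor::preprocessQuery` actually uses.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Metric { IP, Cosine, L2 };  // hypothetical, for illustration only

// Hypothetical stand-in for the query preprocessing step: returns a blob of
// dim + 1 values, where the last slot holds sum(y_i) for IP/Cosine or
// sum(y_i^2) for L2.
template <typename DataType>
std::vector<DataType> build_query_blob(const DataType *query, std::size_t dim, Metric metric) {
    std::vector<DataType> blob(dim + 1);
    DataType extra = 0;
    for (std::size_t i = 0; i < dim; ++i) {
        blob[i] = query[i];  // step 2: copy the original query values
        // step 3: accumulate the metric-specific value
        extra += (metric == Metric::L2) ? query[i] * query[i] : query[i];
    }
    blob[dim] = extra;  // appended right after the dim query values
    return blob;
}
```

For a query {1, 2, 3}, the appended value is 6 for IP/Cosine and 14 for L2.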

Usage in Distance Calculator

The asymmetric distance function can now retrieve the precomputed value directly from the query blob at offset dim, avoiding redundant summation during each distance computation.
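
As a rough sketch of the consumer side, an L2 distance function could read the precomputed value at index dim as shown below. The name `asymmetric_l2_sq` and its parameter list are assumptions made for this example; in particular, min, delta and x_sum_squares are passed explicitly because this description does not say how the stored vector carries them.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Hypothetical asymmetric squared-L2 distance.
// query_blob holds dim query values followed by y_sum_squares at index dim.
// Assumes stored values dequantize as x_i = min + delta * q_i and that
// x_sum_squares = sum(x_i^2) was computed when the vector was stored.
inline float asymmetric_l2_sq(const std::uint8_t *q, float min, float delta,
                              float x_sum_squares, const float *query_blob, std::size_t dim) {
    float ip = 0.0f;
    for (std::size_t i = 0; i < dim; ++i) {
        ip += (min + delta * static_cast<float>(q[i])) * query_blob[i];
    }
    const float y_sum_squares = query_blob[dim];  // precomputed once per query
    return x_sum_squares - 2.0f * ip + y_sum_squares;
}
```

The loop computes only the inner product; the query's sum of squares is never recomputed per stored vector.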

@meiravgri meiravgri changed the title fix tests [MOD-13325] Add query preprocessing to QuantPreprocessor for asymmetric distance computation Jan 4, 2026
@meiravgri meiravgri changed the title [MOD-13325] Add query preprocessing to QuantPreprocessor for asymmetric distance computation [MOD-13325] Add query preprocessing to QuantPreprocessor for asymmetric distance computation Jan 4, 2026
@meiravgri meiravgri requested a review from dor-forer January 4, 2026 12:41
@codecov

codecov bot commented Jan 4, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 97.05%. Comparing base (f404ee2) to head (cec0292).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #876      +/-   ##
==========================================
- Coverage   97.06%   97.05%   -0.01%     
==========================================
  Files         126      126              
  Lines        7458     7512      +54     
==========================================
+ Hits         7239     7291      +52     
- Misses        219      221       +2     


}
}

DataType sum_fast(const DataType *p) const {
Collaborator


Can we use it in the quantize function in some way? Or does quantize not need fast calculations the way the query does?

Collaborator Author


we always need faster calculations!
good catch

@meiravgri meiravgri requested a review from dor-forer January 4, 2026 17:30
@meiravgri meiravgri added this pull request to the merge queue Jan 4, 2026
Merged via the queue into main with commit 4925862 Jan 4, 2026
18 checks passed
@meiravgri meiravgri deleted the meiravg_precomput_query_for_sq branch January 4, 2026 18:54